Speech compression device



Feb. 10, 1970 R. H. JOHNSTON 3,495,042

SPEECH COMPRESSION DEVICE

Filed May 22, 1967

[Drawing sheet, FIG. 1: labels include CATHODE RAY TUBE, AUDIO OUT and RECORDER. Inventor: Reed H. Johnston.]

United States Patent 3,495,042

SPEECH COMPRESSION DEVICE

Reed H. Johnston, Wellesley, Mass., assignor to Arthur D. Little, Inc., Cambridge, Mass., a corporation of Massachusetts

Filed May 22, 1967, Ser. No. 640,298

Int. Cl. H04b 1/66

U.S. Cl. 179-15.55

8 Claims

ABSTRACT OF THE DISCLOSURE

A speech compression device for making intelligible information available in a shorter than normal time period, which device includes a recording system in which the message is recorded and from which the message can be played out at an increased rate of speed, a cathode ray tube and a circuit for alternately writing a pair of concentric circular traces of different diameters on the phosphor screen of the tube, the audio message from the recording device being used to intensity modulate the traces as they are formed. A mask across the face of the tube blanks off one-half of the traces. An optical system is included for reading out the information on the traces, which system includes a pair of lenses rotatable about a common axis, each lens imaging an unmasked portion of a corresponding trace to a common focal point on that axis. A photoelectric detector is positioned to be responsive to the light imaged to the focal point, for translating the light into electrical signals. The angular velocities of the lens system and of the electron beam forming the traces are unequal. The lens system and electron beam angular velocities are synchronized to one another by being locked to a common AC line frequency.

This invention relates to time-compression of audio signals such as speech and particularly to apparatus for increasing the rate of information transfer without sensibly altering the intelligibility of the information.

Information in speech is frequently delivered or recorded at a rate considerably less than the ability of a listener to comprehend the information. If the recorded speech is played back at an increased rate, the information is conveyed more quickly but the intelligibility is severely degraded by the changes in pitch.

This degradation has been investigated by Fairbanks, Guttman and Miron (Journal of Speech and Hearing Disorders, 22, 10-19, 1957), who found that if the voice is reproduced at normal pitch, the intelligibility of the message is not seriously degraded by increased playback rates, provided that the time for playback is not less than about half that required for recording.

A number of techniques for attaining time-compressed speech have been proposed; unfortunately they tend to be quite complex and expensive and often involve special recording devices.

A principal object of the present invention is to provide means for time-compressing speech which is quite simple, relatively inexpensive and can be used as an adjunct to ordinary recording systems.

These and other objects are generally effected by a device for processing a message containing signals within an audio range and produced during a given time interval, which device includes means for reducing the time interval of the message such that the average frequency of the signals is increased. Means are also provided for time-sampling the reduced-interval message at a predetermined sampling rate. Means are then employed for increasing the duration of the samples so as to reduce the average signal frequency to its original value, the samples of increased duration then being combined to form an uninterrupted sequence in which the message signals are at their original frequency but the message information will be substantially available in a shorter time interval than the original time interval.
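The speed-up, sample, stretch and recombine sequence described above has a simple discrete-time analogue, sketched below in Python purely as an illustration. It assumes the message is already available as a list of digitized samples, which the disclosure does not describe; the function name `time_compress` and the parameters `n` and `keep` are hypothetical. Keeping `keep` consecutive samples out of every `n * keep` and abutting the kept runs leaves the signal inside each run at its original frequency while shortening the whole message by the factor n.

```python
def time_compress(samples, n=2, keep=4):
    """Keep `keep` consecutive samples out of every n*keep and abut the runs.

    Illustrative analogue only: `samples` is an assumed digitized message,
    `n` is the compression factor, `keep` is the retained run length.
    """
    if n < 1 or keep < 1:
        raise ValueError("n and keep must be positive integers")
    frame = n * keep  # input samples consumed per retained run
    compressed = []
    for start in range(0, len(samples), frame):
        # Retain the first `keep` samples of the frame; the rest are discarded.
        compressed.extend(samples[start:start + keep])
    return compressed


message = list(range(16))
shortened = time_compress(message, n=2, keep=4)
```

With `n=2` and `keep=4`, the sixteen input samples reduce to the eight samples 0 to 3 and 8 to 11, each run untouched internally.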

Generally, a preferred means for reducing the time interval involves the use of a temporary storage medium capable of storing information at two separate locations such as two recording tracks, and a recording input means, the storage medium and input means being movable relative to one another at a speed sufficient to store the message into a comparatively short period of time. Thus, the means for sampling includes two read-out devices, one for each track, operated alternately. The tracks and read-out devices are movable relative to one another at a speed lesser than the relative speed of the tracks and the recording input means; thus, the duration of each sample provided by the read-out devices is increased. The read-out devices are operated with minimum delay between their respective functions, thereby combining the samples into the requisite uninterrupted sequence. Preferably, recording of the same signals occurs at positions displaced from an operative read-out device so that the input signal is only fed to the read-out device after being in temporary storage.

Other objects of the invention will in part be obvious and will in part appear hereinafter. The invention accordingly comprises the apparatus possessing the construction, combination of elements, and arrangement of parts which are exemplified in the following detailed disclosure, and the scope of the application of which will be indicated in the claims.

For a fuller understanding of the nature and objects of the present invention, reference should be had to the following detailed description taken in connection with the accompanying drawings wherein:

FIG. 1 is a schematic diagram partly in block form of one embodiment of the invention; and

FIG. 2 is an end view of the storage device of FIG. 1.

Referring now to FIG. 1, there is shown one form of the invention comprising means for reducing the time interval of an input message by a factor of n so as to increase the average audio frequencies of the message by that factor, and shown typically at a well known tape recorder 18 which is capable of recording a message at a given speed and of reproducing the message at a higher speed.

Means for time sampling the speeded-up message is formed of cathode ray tube (CRT) 20 of the type having four input terminals including terminal 22 which connects with the internal intensity control such as a grid or cathode of the CRT, and is connected to recorder 18 or other source of the interval-shortened message (in electrical waveform) which is to be time-compressed. The other three terminals of the CRT are respectively connected to the usual four deflection plates such that one pair of orthogonal plates (X, Y) are connected together and to terminal 24, the other two plates (X and Y) being electrically separated and connected respectively to terminals 26 and 28.

Means are included for establishing two locations at which the data impressed at terminal 22 can be stored. To this end, a cyclic or AC signal source 30 is provided, being connected across primary winding 32 of transformer 34. The latter has a center-tapped secondary winding 36 (or two series-connected windings). The center tap 38 is connected to the X, Y terminal 24 of the CRT. Terminal 26 is connected through capacitor 40 to center tap 38, and also through resistor 42 to one end of secondary winding 36. Terminal 28 is connected through resistor 44 to center tap 38, and also through capacitor 46 to the other end of winding 36.

Preferably resistors 42 and 44 are not of equal value, and capacitors 40 and 46 are not of equal capacitance because of the difference in sensitivity usual for the X and Y deflection plates. A load resistor 48 is in series between source 30 and primary winding 32. In parallel with one another and with resistor 48 are first and second controlled rectifiers 50 and 52, connected anode-to-cathode. The gates of the rectifiers are tied together and connected to the output of means, such as multivibrator 54, for controlling the rectifiers at half the frequency of the cyclic signal from source 30. Multivibrator 54 is thus connected to source 30 so that the multivibrator output signals are frequency locked to the AC source. Obviously, then, multivibrator 54 operates as a divide-by-two circuit.

It will be seen that the AC signal supplied by source 30, typically a 60 cycle line signal, will be applied to the plates of CRT 20 with a quadrature phase shift due to the capacitor-resistor network coupling between the plates and secondary winding 36. Thus, with appropriate choice of resistors 42 and 44 and capacitors 40 and 46, the trace appearing on the face of the CRT will be circular. However, multivibrator 54 puts out a signal only during alternate full cycles. Thus, for each alternate cycle, both rectifiers 50 and 52 are gated into conduction, and a full cycle of the signal from source 30 is shunted around resistor 48 by the rectifiers. When the rectifiers are both non-conductive, the power must pass through resistor 48 reducing the voltage across primary winding 32. Therefore, successive cycles produce circular scans 56 and 58, as shown in FIG. 2, of alternately larger and smaller radii.
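The quadrature relationship produced by the capacitor-resistor network can be checked with a short phasor calculation. The sketch below is an idealized model, not a design from the disclosure: it assumes the deflection plates draw no current, so that each of terminals 26 and 28 sits at the junction of a simple R-C divider across one half of winding 36. The function names and all component values are hypothetical.

```python
import cmath
import math


def deflection_phasors(v_half, r42, c40, r44, c46, freq_hz=60.0):
    """Return phasors at terminals 26 and 28, measured from center tap 38.

    Idealized assumption: the plates load the network negligibly.
    """
    w = 2 * math.pi * freq_hz
    z40 = 1 / (1j * w * c40)  # impedance of capacitor 40
    z46 = 1 / (1j * w * c46)  # impedance of capacitor 46
    # Terminal 26: resistor 42 to one winding end (+v_half), capacitor 40 to the tap.
    v26 = v_half * z40 / (r42 + z40)
    # Terminal 28: capacitor 46 to the other winding end (-v_half), resistor 44 to the tap.
    v28 = -v_half * r44 / (r44 + z46)
    return v26, v28


def phase_difference_deg(v26, v28):
    """Phase of v26 relative to v28, in degrees."""
    return math.degrees(cmath.phase(v26 / v28))


C = 0.1e-6                          # hypothetical capacitance, farads
R = 1 / (2 * math.pi * 60.0 * C)    # chosen so that w*R*C = 1 at 60 cycles
v26, v28 = deflection_phasors(100.0, R, C, R, C)
```

In this idealized model the two voltages are in quadrature whenever the products of resistor 42 with capacitor 40 and of resistor 44 with capacitor 46 are equal, and the ratio of the two amplitudes then equals that common value of wRC. This is one way unequal component values could offset unequal plate sensitivities while keeping the trace circular.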

Because of the connection to terminal 22, the intensity of the traces or scans is modulated so that variations in illumination reproduce the input audio signal. The face of the CRT will therefore exhibit a trace executing one complete revolution in 1/60 second, containing information received over that time interval, and successive revolutions will contain information alternately on scans 56 and 58.

In order to time sample the information presented by the CRT, the face of the latter is covered by mask 60 (shown in plan in FIG. 2), part of which completely occludes one-half of each trace, typically by a semicircular portion 61. Means are further provided for alternately interrogating or reading out traces 56 and 58, where the latter are not completely masked, in such a manner that the read-out of each semicircular trace portion is accomplished in the same 1/60 second interval required to produce the full circular trace, and without any delay or dwell between the read-out of alternate traces. To this end, there is provided motor 62, preferably a synchronous type, connected across source 30 so as to be rotatable at a rate which is a submultiple of the frequency of the signal from the AC source. For example, where the frequency of the signal from source 30 is 60 cycles/sec., motor 62 can rotate synchronously at 1800 rpm (30 revolutions/sec.), which is one-half of the AC line frequency.

In the form shown, shaft 64 of motor 62 is hollow and is filled with a transparent light pipe, typically of clear synthetic polymeric material such as polymethylmethacrylate or the like. One end of the shaft carries an opaque covering with a slit 66, e.g. about 0.010" in width.

Mounted on shaft 64 adjacent slit 66 is lens mount 68 carrying a pair of lenses 70 and 72 on opposite sides of the axis of rotation of the shaft and positioned for respectively imaging traces 56 and 58 onto slit 66. Thus, as motor 62 rotates, lens 70 will image trace 56 onto slit 66 while lens 72 is blanked because it faces mask 60. As rotation continues, lens 70 moves to face mask 60 just as lens 72 is brought into position to image trace 58.

Mounted adjacent the other end of shaft 64 so as to be exposed to light transmitted through slit 66 and the light pipe is a transducer such as photomultiplier 74.

The latter converts light received into an electrical signal which is then amplified in amplifier 76. The output of the latter is then the desired audio signal which can be fed to a recorder, loudspeaker or the like. The power sources (not shown) for the amplifier, the photomultiplier and the CRT can all be in common if desired.

Light emission from the phosphors of the CRT, of course, decays in time following energization. Hence, while one-half of mask 60 is opaque, the other semicircular portion 78 of the mask is in the form of a gray wedge (typically of the type used in photography) through which the transmission preferably varies as the inverse of the phosphor decay time function. If the traces are swept out clockwise as shown by the arrow in FIG. 2, the transmissiveness of gray wedge portion 78 along the track of the traces varies from maximum to minimum in the same direction. Thus, equal amplitude input signals to terminal 22 will give rise to equal light intensities as viewed through portion 78, regardless of where written on the semicircular portion of the phosphor of the CRT.
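The compensation performed by the gray wedge can be illustrated numerically. The sketch below is an example under stated assumptions rather than the disclosed design: it assumes a simple exponential decay with a hypothetical 10 millisecond time constant (real phosphors differ) and hypothetical phosphor ages at the moment of read-out; the names `wedge_transmission` and `exponential_decay` are illustrative. Each transmission value is taken as the inverse of the decay at that age, scaled so that the dimmest point is passed at full transmission.

```python
import math


def wedge_transmission(ages_s, decay):
    """Relative transmission per point so that decay(age) * transmission is constant."""
    brightness = [decay(age) for age in ages_s]
    dimmest = min(brightness)
    # Inverse of the decay function, scaled so the largest transmission is 1.0.
    return [dimmest / b for b in brightness]


def exponential_decay(age_s, tau_s=0.010):
    """Hypothetical decay law; tau_s is an assumed time constant in seconds."""
    return math.exp(-age_s / tau_s)


ages = [0.008, 0.012, 0.016]  # hypothetical ages (s) of three points when read
transmission = wedge_transmission(ages, exponential_decay)
# Light reaching the lens: phosphor brightness times wedge transmission.
viewed = [exponential_decay(a) * t for a, t in zip(ages, transmission)]
```

Every entry of `viewed` comes out equal, which is the property the wedge is meant to provide: the light reaching the lens no longer depends on how long ago a given point was written.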

There are, of course, a large number of phosphors of varying decay times that can be used. For example, the electron beam writing out a trace can be adjusted to write 157.5 degrees ahead of the mechanical scan at the beginning of each read-out cycle by the appropriate lens position, and a P phosphor is then used. This particular choice requires only a modest density range for the gray wedge portion of the mask because the transmission range is less than a factor of two.

Delayed signals resulting from phosphorescent persistence will tend to arrive after delays of about 33 and 67 milliseconds for the embodiment described, but will have energies of about 16.5 and 28 decibels below the main signal. These can readily be suppressed by sampling at 1/120 second rather than 1/60 second, as by mounting the lenses so that they are displaced by 90 degrees rather than 180 degrees. The light pipe would then be fitted with a pair of noninteracting slits rather than a single slit, and the phosphor changed to a P11 type. Lastly, the mask would blank off opposite quadrants rather than semicircles of the CRT face and the gray wedge would be adjusted for the P11 phosphor. In this instance, the echo or delayed signal would lie between about 31 and 49 decibels below the regular signal.
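The quoted echo delays are consistent with each trace being rewritten on every second cycle of the 60 cycle source, so that residual light from earlier writings of the same trace is one and two rewrite periods old. That reading is an inference from the figures given, not a statement in the text, and the short Python sketch below simply works the arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def echo_delays(line_hz=60, traces=2, count=2):
    """Delays (s) of earlier writings of the same trace, assuming one
    rewrite of each trace every `traces` cycles of the line frequency."""
    rewrite_period = Fraction(traces, line_hz)
    return [k * rewrite_period for k in range(1, count + 1)]


def db_below_to_energy_ratio(db_below):
    """Convert 'x decibels below the main signal' to an energy ratio."""
    return 10 ** (-db_below / 10)


delays_ms = [float(d) * 1000 for d in echo_delays()]
first_echo_ratio = db_below_to_energy_ratio(16.5)
```

The delays evaluate to 1/30 and 1/15 second, about 33 and 67 milliseconds, and an echo 16.5 decibels down carries a little over two percent of the main signal energy.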

In operation, typically a 60 cycle signal is provided by source 30 and applied to primary winding 32. This induces signals in both halves of secondary winding 36, which signals should be provided with amplitudes inversely proportional to the X and Y deflection sensitivities and 90 degrees apart in phase by adjustment of capacitors 40 and 46 and resistors 42 and 44. The application of the two sinusoidal voltages from the secondary provides sweep voltages that move the electron beam in tube 20 in a circular path.

As previously noted, alternate full cycles of the input signal from source 30 are put through a different impedance in series with primary winding 32, resulting in alternately larger and smaller circular traces being swept out on the face of tube 20. Audio signals applied at terminal 22 will amplitude modulate the brightness of these alternate concentric traces. Consequently, each 1/60 sec. portion of an audio message will appear as the modulation of one whole circular trace, the next 1/60 sec. portion of the message appearing on the other circular trace. The audio message applied at terminal 22 typically is derived from a record played into terminal 22 by recorder 18 at twice the speed of the original recording. Hence, if M is the original message time-length at an average frequency F, then the signal at terminal 22 is applied in an interval of M/2 with an average frequency of 2F because the message time interval has been reduced by a factor of 2.

Motor 62, rotating at 1800 r.p.m., sweeps lens 70 about a full circle in 1/30 sec. and therefore reads whatever appears on the half of trace 56 left exposed by mask 60 in 1/60 sec. Similarly, lens 72 views half of trace 58 in the next 1/60 sec. Thus, one-half of the message on each trace is discarded by mask 60 and one-half read. The use of a half-masking technique thus ensures that the portions of the traces displayed are effectively time sampled at a rate w/t, where t is unit time such as a second and w is the number of times in each second that the message is available for viewing, i.e. 60. The duration of each sample when written into temporary storage is then t/nw, or 1/120 second, which should be greater than the average period (the reciprocal of the frequency) of the audio signals being displayed, i.e. temporarily stored. This sample duration is effectively increased because the angular rotational speed of the lenses is half (i.e. divided by n) the angular rotational speed at which the electron beam lays down the traces in tube 20.
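The timing relationships of the embodiment can be worked through with exact fractions. The sketch below assumes the values given in the description (a 60 cycle source, an 1800 r.p.m. motor, a half-masked trace and two lenses on opposite sides of the axis); the function name `readout_timing` and the dictionary keys are hypothetical.

```python
from fractions import Fraction


def readout_timing(line_hz=60, motor_rpm=1800):
    """Sample durations for the half-masked, two-lens arrangement."""
    beam_rev_per_s = Fraction(line_hz)        # one trace revolution per line cycle
    lens_rev_per_s = Fraction(motor_rpm, 60)  # lens mount revolutions per second
    trace_time = 1 / beam_rev_per_s           # time to write one full trace
    stored_sample = trace_time / 2            # unmasked half of a trace, as written
    read_sample = Fraction(1, 2) / lens_rev_per_s  # half-turn of one lens
    return {
        "samples_per_second": beam_rev_per_s,
        "stored_sample_s": stored_sample,
        "read_sample_s": read_sample,
        "stretch_factor": read_sample / stored_sample,
        # Each read-out lasts exactly one line cycle, so the next trace
        # is ready when the other lens arrives and no gap occurs.
        "gapless": read_sample == trace_time,
    }


timing = readout_timing()
```

For the stated values each sample occupies 1/120 second as written and 1/60 second as read, a stretch by the factor 2 that restores the original frequencies, and successive read-outs abut without dwell.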

Alternatively, one can form the apparatus of the present invention from other elements. For example, the temporary storage element of the invention has been described as a cathode ray tube. In place of a cathode ray tube one can employ a rotatable recording disk having for example a magnetic coating. In such case, read-out, instead of being optical as described, would be accomplished by standard magnetic reproduction heads. A disk should preferably be quite thin so that recording can be accomplished on one side and reading on the other.

Since certain changes may be made in the above apparatus without departing from the scope of the invention herein involved, it is intended that all matter contained in the above description or shown in the accompanying drawing shall be interpreted in an illustrative and not in a limiting sense.

What is claimed is:

1. Apparatus for time-compressing information in a message containing signals within an audio range and produced during a given time interval, said apparatus comprising:

means for reducing the time interval of said message by a factor of n so as to increase the average frequencies of said signals by said factor;

means for time-sampling the reduced-time message at a sampling rate w/t so as to obtain sample portions of said message each of duration t/nw, where t is unit time, and t/nw is greater than the average period of said audio signals in said reduced-time message;

means for increasing the duration of each of said samples by said factor, so as to reduce the average frequency of said signals to its original value; and means for combining the samples of increased duration to form an uninterrupted sequence thereof.

2. Apparatus as defined in claim 1 wherein said means for reducing the time interval comprises:

means for temporarily storing said message;

means for transferring said message to the temporary storage means at an input rate n times the normal rate of transmission of said message.

3. Apparatus as defined in claim 2 wherein said means for increasing the duration comprises means for reading said samples out of said temporary storage medium at an output rate 1/n times said input rate.

4. Apparatus for time-compressing information in a message containing signals within an audio range and produced during a given time interval, said apparatus comprising:

means for increasing the normal rate of delivery of said message by a factor of n;

means for temporarily storing said message at its increased rate;

means for time-sampling the message in storage;

means for reading the time-samples of said message from storage at a rate reduced by a factor of n so that said samples read out from storage are combined in an uninterrupted sequence.

5. Apparatus as defined in claim 4 wherein said temporary storage means includes a storage medium having a pair of opposite surfaces, and means for writing said message at one of said surfaces, said means for reading being operative at the other of said surfaces.

6. Apparatus as defined in claim 5 wherein:

said storage medium is a phosphor screen;

said means for storing comprises an electron beam,

means for modulating said beam in accordance with said signals, and means for sweeping said beam in a path across a surface of said screen so as to excite the phosphors thereof, and

said means for reading comprises an optical system movable along said path adjacent the opposite surface of said screen for focussing light from excited phosphors, and light detecting means for translating light focussed by said system into electrical signals.

7. Apparatus as defined in claim 6 wherein:

said means for time-sampling includes means for alternately forming substantially circular traces of different diameters on said screen with said beam and means for masking at least a portion of each of said traces; and

said optical system including means for focussing to a predetermined position light from the unmasked portion of only one of said traces and means for focussing to said position light from the unmasked portion of one other of said traces.

8. Apparatus as defined in claim 4 wherein the angular velocity of said beam when forming said traces and the velocity of said optical system when moving along said path are synchronized by being locked to a common electrical frequency.

References Cited

UNITED STATES PATENTS

2,650,949 9/1953 Veaux.
2,886,650 5/1959 Fairbanks et al.
3,337,800 8/1967 Halley.

KATHLEEN H. CLAFFY, Primary Examiner B. P. SMITH, Assistant Examiner 

